Towards Stable and Robust AdderNets

Neural Information Processing Systems

Adder neural networks (AdderNets) replace the massive multiplications in conventional convolutions with cheap additions while achieving comparable performance, thereby yielding a series of energy-efficient neural networks. Compared with convolutional neural networks (CNNs), the training of AdderNets is considerably more sophisticated, requiring several techniques for adjusting gradients and batch normalization. In addition, the variances of both weights and activations in the resulting adder networks are enormous, which limits their performance and their potential for application to other tasks. To enhance the stability and robustness of AdderNets, we first thoroughly analyze the variance estimation of the weight parameters and output features of an arbitrary adder layer. Then, we develop a weight normalization scheme that adaptively optimizes the weight distribution of AdderNets during training, which reduces the perturbation on the running mean and variance in batch normalization layers. Meanwhile, the proposed weight normalization can also be utilized to enhance the adversarial robustness of the resulting networks. Experiments conducted on several benchmarks demonstrate the superiority of the proposed approach for generating AdderNets with higher performance.
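To make the core idea concrete, the following minimal NumPy sketch shows a single-channel adder layer, where each output pixel is the negative L1 distance between an input patch and the filter (instead of a multiply-accumulate), together with an illustrative weight-standardization step. The function names and the exact normalization formula here are assumptions for illustration; the paper's actual weight normalization scheme may differ.

```python
import numpy as np

def adder2d(x, w):
    """Single-channel adder layer: each output entry is the negative
    L1 distance between an input patch and the filter, replacing the
    multiply-accumulate of a standard convolution with additions
    (subtractions and absolute values)."""
    kh, kw = w.shape
    oh, ow = x.shape[0] - kh + 1, x.shape[1] - kw + 1
    y = np.empty((oh, ow))
    for i in range(oh):
        for j in range(ow):
            patch = x[i:i + kh, j:j + kw]
            y[i, j] = -np.abs(patch - w).sum()
    return y

def normalize_weights(w, eps=1e-5):
    """Illustrative weight standardization: rescale the filter to zero
    mean and unit variance before the adder operation, so the variance
    of the layer's output stays controlled during training. (The exact
    adaptive normalization used in the paper is not reproduced here.)"""
    return (w - w.mean()) / (w.std() + eps)

x = np.arange(16, dtype=float).reshape(4, 4)
w = np.arange(9, dtype=float).reshape(3, 3)
y = adder2d(x, normalize_weights(w))
print(y.shape)  # (2, 2)
```

Note that when the filter exactly matches a patch, the output at that location is 0 (the maximum attainable value), which is why the adder output acts as a similarity measure.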


Towards Stable and Robust AdderNets (Supplementary Material) — Minjing Dong, Yunhe Wang

Neural Information Processing Systems

As shown in Table 1, the stability of adversarial robustness is evaluated under different inference settings. We first show whether shuffling the test set influences the performance. In the main body, we mainly focus on the comparison with CNNs, since we want to highlight the natural robustness of AdderNets compared to CNNs under the same setting. We further evaluate the performance of AWN on CNNs. As shown in Table 3, with the involvement of AWN, CNNs obtain slightly better adversarial robustness.
